

Search for: All records

Creators/Authors contains: "Bu, Yuheng"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. This paper proposes an optimization-based method to learn the singular value decomposition (SVD) of a compact operator with ordered singular functions. The proposed objective function is based on Schmidt’s low-rank approximation theorem (1907), which characterizes a truncated SVD as a solution minimizing the mean squared error, together with a technique called nesting to learn the ordered structure. When the optimization space is parameterized by neural networks, we refer to the proposed method as NeuralSVD. Unlike existing approaches, the implementation does not require sophisticated optimization tricks. (A matrix-case sketch of the nested objective appears after this list.)
    Free, publicly-accessible full text available December 15, 2024
  2. Generalization error bounds are essential to understanding machine learning algorithms. This paper presents novel expected generalization error upper bounds based on the average joint distribution between the output hypothesis and each input training sample. Multiple generalization error upper bounds based on different information measures are provided, including Wasserstein distance, total variation distance, KL divergence, and Jensen-Shannon divergence. Due to the convexity of these information measures, the proposed bounds in terms of Wasserstein distance and total variation distance are shown to be tighter than their counterparts based on individual samples in the literature; the convexity step is sketched after this list. An example is provided to demonstrate the tightness of the proposed generalization error bounds.
  3. We generalize the information bottleneck (IB) and privacy funnel (PF) problems by introducing the notion of a sensitive attribute, which arises in a growing number of applications. In this generalization, we seek to construct representations of observations that are maximally (or minimally) informative about a target variable, while also satisfying constraints with respect to a variable corresponding to the sensitive attribute (one plausible formulation is sketched after this list). In the Gaussian and discrete settings, we show that by suitably approximating the Kullback-Leibler (KL) divergence defining traditional Shannon mutual information, the generalized IB and PF problems can be formulated as semi-definite programs (SDPs) and thus solved efficiently, which is important in applications of high-dimensional inference. We validate our algorithms on synthetic data and demonstrate their use in imposing fairness in machine learning on real data as an illustrative application.
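
For item 1, the following is a minimal, matrix-case sketch (in PyTorch) of the two ideas named in the abstract: a mean-squared low-rank approximation objective and the nesting technique, under which every prefix of the learned components must itself be a good low-rank approximation, so the components come out ordered. It is an illustrative toy on a finite random matrix, not the paper's operator or neural-network formulation; the matrix A, the rank L, and all hyperparameters are assumptions made for the example.

    import torch

    torch.manual_seed(0)
    m, n, L = 50, 40, 5
    A = torch.randn(m, n)  # stand-in for a compact operator (finite matrix case)

    # Learnable factor columns; column l should converge, up to sign and an
    # arbitrary scale split between U and V, to the l-th singular component.
    U = torch.nn.Parameter(0.01 * torch.randn(m, L))
    V = torch.nn.Parameter(0.01 * torch.randn(n, L))
    opt = torch.optim.Adam([U, V], lr=1e-2)

    for step in range(5000):
        opt.zero_grad()
        # Nesting: sum the reconstruction error of every prefix of columns,
        # so each prefix is pushed toward the best rank-l approximation.
        loss = sum(((A - U[:, :l] @ V[:, :l].T) ** 2).mean()
                   for l in range(1, L + 1))
        loss.backward()
        opt.step()

    # Recovered singular values are the norms of the learned rank-one terms.
    sigmas = torch.linalg.norm(U, dim=0) * torch.linalg.norm(V, dim=0)
    print(sigmas)                        # approximately decreasing
    print(torch.linalg.svdvals(A)[:L])   # reference values from an exact SVD

By Schmidt's theorem, each prefix term in the loss is minimized by the corresponding truncated SVD, which is what pins down the ordering of the learned components.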
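
For item 2, the following LaTeX sketch spells out the convexity step mentioned in the abstract, i.e., why bounds stated through the average joint distribution are no looser than the corresponding per-sample bounds. The notation (the averaged joint distribution \bar{P}_{W,Z}, the data marginal \mu) is assumed for illustration, and the constants and sub-Gaussian or Lipschitz conditions appearing in the actual bounds are omitted.

    % Average joint distribution between the output hypothesis W and a training sample
    \bar{P}_{W,Z} = \frac{1}{n}\sum_{i=1}^{n} P_{W,Z_i}

    % Total variation distance and the Wasserstein distance are convex in their
    % first argument, so by Jensen's inequality
    D\big(\bar{P}_{W,Z},\, P_W \otimes \mu\big)
      \le \frac{1}{n}\sum_{i=1}^{n} D\big(P_{W,Z_i},\, P_W \otimes \mu\big),
    \qquad D \in \{\mathrm{TV},\, W_1\}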
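
For item 3, the following LaTeX sketch gives one plausible way to write the generalized IB and PF problems with a sensitive attribute S; the constraint form and the thresholds r_x, r_s, \gamma are assumptions made for illustration, with the representation T computed from the observation X as in the standard IB/PF setup.

    % Generalized information bottleneck: keep T informative about the target Y
    % while limiting what it retains about X and reveals about S.
    \max_{p(t \mid x)} \; I(T;Y)
      \quad \text{s.t.} \quad I(T;X) \le r_x, \;\; I(T;S) \le r_s

    % Generalized privacy funnel: minimize leakage about S subject to a
    % utility requirement on the target Y.
    \min_{p(t \mid x)} \; I(T;S)
      \quad \text{s.t.} \quad I(T;Y) \ge \gamma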